perm filename PAPER.PUB[2,TES] blob
sn#009886 filedate 1972-07-28 generic text, type T, neo UTF8
00100 .SEC INTRODUCTION
00200 A study at the Stanford Artificial Intelligence Project (AI) has
00300 shown that it is more economical to prepare text on computer
00400 terminals than on typewriters for documents that are subject to
00500 revision at least once. The AI Lab has an in-house PDP-10/50
00600 Time-Sharing system with about 40 terminals, nearly all of which are
00700 of the keyboard-display type. To encourage and facilitate
00800 use of the computer in the publication process, the Lab
00900 provides text editing and formatting software and a variety of output
01000 media.
01100
01200 Currently available for text editing are a teletype-oriented and a
01300 display-oriented editor. Documents can be printed on a Model 37
01400 Teletype, a high-speed printer, or microfilm (the microfilm is
01500 prepared using FR-80 services purchased from a vendor). For widely
01600 circulated reports, any of these media can be used to prepare offset
01700 masters. Output via the Xerox Graphic Printer is to be implemented
01800 shortly, providing mixed user-definable type fonts and graphics.
01900
02000 The term "text formatting" applies to the processing that follows
02100 interactive text editing and precedes document printing. It includes
02200 justification, page numbering, section numbering, layout, footnote
02300 placement, and special capabilities such as index preparation and
02400 cross-referencing. Although several text formatting systems are
02500 available for the PDP-10, the desire for additional capabilities led
02600 to the development of a new kind of program, known as a
02700 "document compiler". A prototype document compiler has been in use
02800 since the fall of 1971; it is named "PUB" (PUBlication system).
02900
03000 The input to PUB is a "manuscript" file, prepared using one of the
03100 available text editors. The manuscript contains the unformatted text
03200 of the publication, plus commands and control characters that direct
03300 PUB in the formatting process. The output of PUB is a "document"
03400 file, i.e., a disk file which can be printed on one of the available
03500 output devices by standard utility programs.
03600
03700 PUB is called a "document compiler" because of several analogies
03800 between it and compilers for programming languages. PUB contains an
03900 Algol-like language with macros, in which the user can process
04000 integer and character-string data to achieve complex formatting
04100 operations. Cross-referencing is achieved with the aid of "labels"
04200 very similar to the labels customary in programming languages.
04300 Automatic numbering of sections, figures, equations, footnotes,
04400 pages, and other entities is implemented using "counters" that are
04500 stepped and reset under control of a statement resembling the Algol
04600 FOR statement.
04700
04800 Even to the time-sharing monitor, PUB appears to be a compiler. Its
04900 "source program" is the manuscript and its "object program" is the
05000 document. Monitor facilities for rapid cycling through the
05100 edit-compile-execute loop of program development have been made
05200 available in the edit-compile-print loop of document preparation.
00100 .SEC PUB LANGUAGES
00200 PUB has both a ∪text ∪language and a ∪command ∪language. Basic
00300 components of the text language are ∪characters, ∪words, ∪sentences,
00400 and ∪paragraphs. The command language includes ∪numbers, ∪strings,
00500 ∪variables, ∪expressions, ∪declarations, ∪statements, and ∪labels. A
00600 document is programmed by coordinated use of these two languages.
00700
00800 The text language adheres closely to informal conventions common in
00900 the preparation of manuscripts for publication. A word usually ends
01000 at a space or carriage-return, a sentence at a period, question mark,
01100 or exclamation mark, and a paragraph at a blank line. A programmer
01200 can specify alternative conventions if desired.
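
The default conventions above can be illustrated with a minimal Python sketch. It is not PUB's own code: the function names are hypothetical, and it assumes only the default rules stated in the paragraph (words end at spaces or carriage-returns, sentences at ".", "?", or "!", paragraphs at blank lines).

```python
import re

def split_paragraphs(text):
    """Split text into paragraphs at blank lines (assumed default rule)."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [b for b in blocks if b.strip()]

def split_words(paragraph):
    """Split a paragraph into words at spaces and line ends."""
    return paragraph.split()

def split_sentences(paragraph):
    """Group words into sentences that end at '.', '?' or '!'."""
    sentences, current = [], []
    for word in split_words(paragraph):
        current.append(word)
        if word.endswith((".", "?", "!")):
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing words with no end mark form a last sentence
        sentences.append(" ".join(current))
    return sentences

sample = "Is this a word? Yes.\nIt ends here!\n\nSecond paragraph."
paragraphs = split_paragraphs(sample)
```

A sentence rule this simple would also break at abbreviations such as "i.e."; the source does not say how PUB handles that case.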
01300
01400 During output of the document, the amount of space left between words
01500 and sentences is controlled by various mode settings and is subject
01600 to expansion by a uniform justification algorithm. Paragraph layout and
01700 indentation are specified by declarations of the command language.
01800 Intra-line formatting operations such as underlining and subscripting
01900 are specified by text control characters designated by the programmer.
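
The source does not describe how the justification algorithm distributes the extra space. As a rough illustration only, the Python sketch below assumes one plausible scheme: padding is spread as evenly as possible over the gaps between words, with any remainder going to the leftmost gaps. The function name `justify` is hypothetical.

```python
def justify(words, width):
    """Pad the gaps between words so the line is exactly `width` long.

    Assumed scheme (not taken from PUB): every gap gets the same base
    number of spaces, and leftover spaces go to the leftmost gaps.
    """
    if len(words) < 2:
        return " ".join(words)  # nothing to expand
    gaps = len(words) - 1
    text_len = sum(len(w) for w in words)
    padding = width - text_len
    if padding < gaps:
        # The words do not fit in `width` even with single spaces;
        # return them unjustified.
        return " ".join(words)
    base, extra = divmod(padding, gaps)
    parts = []
    for i, word in enumerate(words[:-1]):
        parts.append(word)
        parts.append(" " * (base + (1 if i < extra else 0)))
    parts.append(words[-1])
    return "".join(parts)
```

For example, `justify(["a", "bb", "c"], 8)` places two spaces in each gap.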
02000
02100 Each line of the manuscript that has a specified character
02200 in column 1 is a "command line". The Period is the character that
02300 normally serves this function, but like all control characters,
02400 it may be changed by declarations of the command language.
02500 A command line generally contains command language information, but
02600 it is possible to switch to text language by use of the delimiter
02700 "}" (right curly bracket). In text language, it is possible to
02800 switch back to command language by use of a designated control
02900 character. The recommended character to serve this function is
03000 "{" (left curly bracket).
03100
03200 Each line of the manuscript that does not have the command character
03300 in column 1 is a "text line". A text line generally contains text
03400 language information, but it is possible to switch to command
03500 language using the "{" control character, and to switch back to
03600 text with the "}" delimiter.
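
The switching rules for command lines and text lines can be sketched in Python as follows. This is an illustration under stated assumptions, not PUB's scanner: the name `split_modes` is hypothetical, the control characters are the defaults named above, and nesting, quoting, and redefinition of control characters are ignored.

```python
def split_modes(line, command_char=".", open_char="{", close_char="}"):
    """Split one manuscript line into (mode, text) segments.

    A line with `command_char` in column 1 starts in command mode; any
    other line starts in text mode.  "{" switches text -> command and
    "}" switches command -> text (assumed defaults).
    """
    if line.startswith(command_char):
        mode, rest = "command", line[len(command_char):]
    else:
        mode, rest = "text", line
    segments, current = [], ""
    for ch in rest:
        if mode == "text" and ch == open_char:
            switch_to = "command"
        elif mode == "command" and ch == close_char:
            switch_to = "text"
        else:
            current += ch  # ordinary character in the current mode
            continue
        if current:
            segments.append((mode, current))
        mode, current = switch_to, ""
    if current:
        segments.append((mode, current))
    return segments
```

Applied to the text line `version no. {VERSION}, created`, the sketch yields a text segment, a command segment holding `VERSION`, and a final text segment.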
03700
03800 An important statement of the command language is the ∪computed ∪text
03900 statement. Syntactically, it is any variable, constant, or
04000 parenthesized expression that occurs in isolation; most frequently,
04100 it occurs between curly brackets as a brief command embedded in
04200 a text line. The variable, constant, or parenthesized expression is
04300 evaluated, and its character string value is inserted into the
04400 document output. An example of the use of computed text is shown
04500 below:
04600 .B
04700 .VERSION ← 6 ;
04800 Fidjel Report, version no. {VERSION}, created {DATE}.
04900 .E
05000 The statement "VERSION ← 6" assigns the value "6" to the variable
05100 VERSION. The next line of text includes two computed text statements.
05200 The first outputs the value of the variable VERSION; the second
05300 outputs the value of the variable DATE, which is automatically computed
05400 by PUB. If the above manuscript were compiled on March 8, 1973, the
05500 output produced would be:
05600 .B
05700 Fidjel Report, version no. 6, created March 8, 1973.
05800 .E
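
The substitution performed in this example can be imitated with a short Python sketch. It handles only bare variable names between curly brackets, not constants or parenthesized expressions, and the names `expand_computed_text` and `format_date` are hypothetical. The date is passed in explicitly instead of being read from the clock, so that the result matches the example above.

```python
import datetime
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def format_date(d):
    """Format a date in the style shown above, e.g. 'March 8, 1973'."""
    return f"{MONTHS[d.month - 1]} {d.day}, {d.year}"

def expand_computed_text(line, variables):
    """Replace each {NAME} with the string value of variables[NAME]."""
    def substitute(match):
        name = match.group(1).strip()
        return str(variables[name])  # KeyError if the name is undefined
    return re.sub(r"\{([^{}]*)\}", substitute, line)

variables = {
    "VERSION": 6,                                      # VERSION <- 6
    "DATE": format_date(datetime.date(1973, 3, 8)),    # stand-in for PUB's DATE
}
line = "Fidjel Report, version no. {VERSION}, created {DATE}."
result = expand_computed_text(line, variables)
```

Here `result` holds the same output line as the example above.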
00100 .SEC MACROS
00200 A sequence of PUB commands that is repeated throughout the manuscript can
00300 be abbreviated by use of the macro facility. For example, a typical
00400 sequence that occurs at the beginning of each section or chapter is:
00500 .b
00600 α.NEXT PAGE ; NEXT SECTION ;
00700 .e
00800 These commands force output to a new page, increment the page number,
00900 and increment the section number. They can be incorporated into a macro
01000 declaration as follows:
01100 .b
01200 α.MACRO SEC ⊂ NEXT PAGE ; NEXT SECTION ; ⊃
01300 .e
01400 Once this macro has been declared, it can be invoked by name in any
01500 command line:
01600 .b
01700 α.SEC
01800 .e
01900 PUB expands the macro and performs the indicated operations.
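
Macro expansion of this kind can be sketched in Python as below. The sketch assumes parameterless macros whose bodies are semicolon-separated statements, as in the SEC example; the function name `expand` and the nesting limit are hypothetical and are not taken from PUB.

```python
def expand(statements_text, macros, depth=0):
    """Expand macro names in a semicolon-separated command string.

    `macros` maps a macro name to its body text.  Returns a flat list
    of statements with every macro invocation replaced by its body.
    """
    if depth > 10:  # arbitrary guard against a macro that invokes itself
        raise ValueError("macro expansion nested too deeply")
    result = []
    for stmt in statements_text.split(";"):
        stmt = stmt.strip()
        if not stmt:
            continue  # skip the empty piece after a trailing ';'
        if stmt in macros:
            result.extend(expand(macros[stmt], macros, depth + 1))
        else:
            result.append(stmt)
    return result

# Corresponds to: MACRO SEC with body NEXT PAGE ; NEXT SECTION ;
macros = {"SEC": "NEXT PAGE ; NEXT SECTION ;"}
expanded = expand("SEC", macros)
```

With this table, invoking `SEC` yields the two statements `NEXT PAGE` and `NEXT SECTION`, in that order.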
02000
00100 .SEC LABELS AND CROSS-REFERENCES
00100 .SEC FRONT AND BACK MATTER
00100 .SEC COUNTERS
00100 .SEC SECTIONING
00100 .SEC PAGE LAYOUT
00100 .SEC IMPLEMENTATION
00100 .SEC DISADVANTAGES
00100 .SEC PLANNED IMPROVEMENTS